Planning Graph-based Heuristics for Cost-sensitive Temporal Planning

Authors

  • Minh Binh Do
  • Subbarao Kambhampati
Abstract

Real-world planners need to be sensitive to the quality of the plans they generate. Unlike classical planning, where quality is often synonymous with having the fewest actions, plan quality in temporal planning is multidimensional. It involves both temporal aspects of the plan (such as makespan, slack, tardiness) and execution cost aspects (such as cumulative action cost, resource consumption). Until now, most domain-independent temporal planners have concentrated solely on the former, ignoring the latter. In this paper, we consider the problem of developing heuristics that are sensitive to both makespan and cost, and develop a planning graph-based approach for this purpose. Our approach involves augmenting a (temporal) planning graph data structure with a mechanism to track the execution cost of the goals and subgoals. Since the cost of achieving a goal is dependent on the amount of available time, we need to track the cost of a literal as a function of time. We present a methodology for efficiently tracking the cost functions, and discuss how they can be used as the basis for deriving heuristics to support any objective function based on makespan and execution cost. We demonstrate the effectiveness of this general method for deriving cost- and makespan-sensitive heuristics in the context of Sapa, a forward chaining planner for metric temporal domains that we have been developing. A version of Sapa using a subset of the techniques discussed in this paper was one of the best domain-independent planners for domains with metric and temporal constraints in the third International Planning Competition, held at AIPS-02.

Introduction

Of late, there has been increased interest in the planning community in leveraging the successes in heuristic control of classical planners to tackle the more realistic metric temporal planning problems. Developing heuristics for metric temporal planning is complicated by the multi-objective nature of the problem.
Acknowledgments: We thank David E. Smith and the AIPS reviewers for useful comments on the earlier drafts of this paper. This research is supported in part by the NSF grant IRI-9801676, and the NASA grants NAG2-1461 and NCC-1225. Copyright © 2002, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

In contrast to classical planning, where the heuristics need only be sensitive to the “length” of the plans, in metric temporal planning the user may be interested in improving either the temporal quality of the plan (e.g. makespan) or its cost (e.g. cumulative action cost, cost of resources consumed, etc.), or more generally, a combination thereof.[1] Consequently, effective plan synthesis requires heuristics that are able to track both these aspects of an evolving plan. Until now, most domain-independent temporal planners have concentrated solely on optimizing makespan (cf. (Haslum & Geffner 2001; Smith & Weld 1999; Do & Kambhampati 2001)), a temporal aspect of plan quality. Consequently, most heuristics for temporal planners are only sensitive to temporal aspects (more specifically, just to makespan). We are interested in developing heuristics that are sensitive to both temporal and cost aspects of plan quality, so that we can effectively handle the multi-objective nature of temporal planning. An important challenge here, as illustrated by the example below, is that the cost and temporal aspects of a plan are often inter-dependent:

Example: Suppose we need to go from Tucson to Los Angeles. The two common options are: (1) rent a car and drive from Tucson to Los Angeles in one day for $100 or (2) take a shuttle to the Phoenix airport and fly to Los Angeles in 3 hours for $200. The first option takes more time (higher makespan) but less money, while the second one clearly takes less time but is more expensive. Depending on the specific weights the user gives to each criterion, she may prefer the first option over the second or vice versa.
Moreover, the user's decision may also be influenced by other constraints on time and cost that are imposed over the final plan. For example, if she needs to be in Los Angeles in six hours, then she may be forced to choose the second option. However, if she has plenty of time but a limited budget, then she may choose the first option. The simple example above shows that makespan and execution cost, while nominally independent of each other, are nevertheless related in terms of the overall objective of the user and the constraints on a given planning problem. More specifically, for a given makespan threshold (such as to be in LA within six hours), there is a certain estimated solution cost tied to it (shuttle fee and ticket price to LA) and vice versa. Thus, in order to find plans that are good with respect to both cost and makespan, we need heuristics that track the cost of a set of (sub)goals as a function of time.

[1] Another dimension of optimization involves execution flexibility (e.g. slack, latency, etc.). In the current paper, we ignore this dimension and concentrate on cost and makespan tradeoffs.

Since the cost of achieving a goal is dependent on the amount of available time (see the example above), we introduce an approach to track the cost of a literal as a function of time. (Figure 3, to be discussed later, shows the cost functions for subgoals in an extended version of the travel example.) Specifically, the costs incurred to achieve facts and to execute actions are estimated by cost propagation while building the temporal planning graph. These cost functions can in turn be used to estimate the achievement cost of a set of goals for a given makespan bound and vice versa. We present the methodology for efficiently maintaining the cost functions, and discuss how these time-sensitive cost functions can be used as the basis for deriving heuristics to support any objective function based on makespan and execution cost.
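To make the idea of "cost as a function of time" concrete, here is a minimal Python sketch built only from the two options in the travel example. It is an illustration, not the cost-function representation used in Sapa: the step-function encoding, the names `LA_COST_STEPS`, `cost_at`, and `best_tradeoff`, the reading of "one day" as 24 hours, and the linear weighted objective are all assumptions made for this example.

```python
from bisect import bisect_right

# Assumed (time_in_hours, cost_in_dollars) breakpoints for reaching Los Angeles,
# taken from the example: fly in 3 hours for $200, or drive in one day
# (read here as 24 hours, an assumption) for $100. Sorted by time.
LA_COST_STEPS = [(3.0, 200.0), (24.0, 100.0)]


def cost_at(steps, deadline):
    """Cheapest listed cost of achieving the goal no later than `deadline`.

    `steps` is a time-sorted list of (time, cost) pairs whose cost does not
    increase as time grows. Returns infinity if no option meets the deadline.
    """
    times = [t for t, _ in steps]
    i = bisect_right(times, deadline)  # number of options available by deadline
    return float("inf") if i == 0 else steps[i - 1][1]


def best_tradeoff(steps, w_time, w_cost):
    """Pick the (time, cost) option minimizing a hypothetical linear objective."""
    return min(steps, key=lambda s: w_time * s[0] + w_cost * s[1])


print(cost_at(LA_COST_STEPS, 6.0))   # deadline of six hours forces the flight
print(cost_at(LA_COST_STEPS, 30.0))  # plenty of time allows the cheaper drive
print(best_tradeoff(LA_COST_STEPS, w_time=10.0, w_cost=1.0))
```

Querying the same function with a deadline gives a cost estimate for a makespan bound, while scanning its breakpoints against a weighted objective gives the other direction of the tradeoff.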
Finally, we empirically demonstrate the effectiveness of our heuristics in generating plans that offer a variety of cost-makespan tradeoffs. Our experiments are done with Sapa, a forward chaining planner for metric temporal domains that we have been developing (Do & Kambhampati 2001). A version of Sapa using a subset of the techniques discussed in this paper was one of the best domain-independent planners for domains with metric and temporal constraints in the third International Planning Competition, held at AIPS-02. In fact, it was the best planner in terms of solution quality and number of problems solved in the highest level of PDDL2.1 setting used in the competition for the two domains Satellite and Rovers, which are inspired by real-world applications being investigated by NASA.

The paper is organized as follows: first we describe the temporal planning problem and the action representation that we assume. Next, we discuss how to build a temporal planning graph and use it to propagate the cost information. The next two sections show how the propagated information can be used to estimate the cost of achieving the goals from a given state. Then, we discuss how the mutual exclusion relations can help to improve the heuristic estimation. We continue with sections on empirical results of using our heuristics in Sapa. We conclude the paper with a discussion of related work, conclusions, and future work.

Action Representation

This section provides the background on the action representation and the different types of constraints in temporal planning problems. Unlike actions in classical planning, in planning problems with temporal and resource constraints, actions are not instantaneous but have durations. Their preconditions may either be instantaneous or durative and their effects may occur at any time point during their execution. Each action A has a duration $D_A$, starting time $S_A$, and end time $E_A$ ($E_A = S_A + D_A$).
The value of the duration $D_A$ can be statically defined for a domain, statically defined for a particular planning problem, or dynamically decided at the time of execution. Action A has preconditions Pre(A) that may be required either to be instantaneously true at the start time point $S_A$, or required to be true starting at $S_A$ and remain true for some duration $d \le D_A$. The logical effects Eff(A) of A are instantaneous and occur at time points $S_A + d$ ($0 \le d \le D_A$). If $d > 0$ then they are called delayed effects, as their onset is delayed with respect to the action start time. Actions can also consume or produce metric resources, and their preconditions may also depend on the value of the corresponding resource. Each action is associated with a cost value, which represents the total money we need to spend to execute that action.

We shall now illustrate the action representation in a simple temporal planning problem. This problem, which is an extended version of the example we introduced in the introduction, will be used as the running example throughout the rest of the paper. Figure 1 shows the problem description graphically. In this problem, a group of students in Tucson need to go to Los Angeles (LA). There are two car rental companies in Tucson. If the students rent a car from the first company, which has faster but more expensive cars (Car1), they can only go to Phoenix (PHX) or Las Vegas (LV). However, if they decide to rent a car from the second company (Car2), which is slower but cheaper, then they can use it to drive to Phoenix or directly to LA. Moreover, to reach LA, the students can also take a train from LV or a flight from PHX. In total, there are 6 actions in the domain: drive-car1-tucson-phoenix, drive-car1-tucson-lv, drive-car2-tucson-phoenix, drive-car2-tucson-la, fly-airplane-phoenix-la, and use-train-lv-la.
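The action representation described above can be sketched as a small Python data structure. This is only an illustrative encoding, not Sapa's internal one: the class names `TimedEffect` and `DurativeAction`, the `schedule` helper, and the duration and cost given to the sample action are hypothetical (the actual values appear in Figure 1, which is not reproduced here). Durative preconditions and metric resources are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedEffect:
    literal: str    # e.g. "at(la)"
    positive: bool  # False means the literal becomes false
    delay: float    # offset d from the action's start time


@dataclass(frozen=True)
class DurativeAction:
    name: str
    duration: float       # D_A
    cost: float           # execution cost of the action
    preconditions: tuple  # literals required at the start time S_A
    effects: tuple        # TimedEffect instances

    def __post_init__(self):
        # Every effect must occur within the action's execution interval.
        for e in self.effects:
            if not 0.0 <= e.delay <= self.duration:
                raise ValueError(f"effect {e.literal} falls outside {self.name}")

    def end_time(self, start):
        """E_A = S_A + D_A."""
        return start + self.duration

    def schedule(self, start):
        """(time, effect) pairs, ordered by time, for an execution starting at `start`."""
        return sorted(((start + e.delay, e) for e in self.effects),
                      key=lambda pair: pair[0])


# Hypothetical duration and cost, for illustration only.
drive = DurativeAction(
    name="drive-car2-tucson-la",
    duration=10.0,
    cost=6.0,
    preconditions=("at(tucson)",),
    effects=(TimedEffect("at(tucson)", False, 0.0),
             TimedEffect("at(la)", True, 10.0)),
)

for time, effect in drive.schedule(start=2.0):
    sign = "" if effect.positive else "not "
    print(time, sign + effect.literal)
```

Effects with a positive `delay` correspond to the delayed effects in the text; an effect with `delay == 0.0` takes place at the action's start.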
Each move (by car/airplane/train) action A between two cities $X$ and $Y$ requires the precondition that the students should be at $X$ ($at(X)$) at the beginning of A. There are also two temporal effects: $\neg at(X)$ occurs at the starting time point of A and $at(Y)$ at the end time point of A. The durations and execution cost values for the six actions described above are shown on the right side of Figure 1. In this case, the costs of moving by train or airplane are the respective ticket prices, and the costs of moving by rental cars include the rental fees and gas (resource) costs.

Propagating cost information

To measure the heuristic distance of a given state to the goals, we need to estimate how costly it is to achieve the goals from that state. All we know is that facts in the initial state have zero costs and that each action has some execution cost. Thus, to evaluate the cost of a set of goals from a given state, we need to propagate the costs from the initial state to the goals using the mutual dependencies between facts and actions. Specifically, the cost to achieve a fact depends on the cost to execute the actions supporting it, which in turn depends on the costs to achieve the facts that are their preconditions. Given that the planning graph is an excellent structure to represent the relation between facts and actions, we will use the temporal planning graph structure (TGP (Smith & Weld 1999)) as a substrate for propagating the cost information. In this section, we start with a brief discussion of the data structures used for the cost propagation process. We then continue with the details of the propagation process and the ...
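The dependency stated in this section (initial facts cost zero; a fact's cost depends on the actions supporting it, whose costs depend on their preconditions) can be illustrated with a simplified fixed-point computation in Python. This sketch is not the paper's procedure: it ignores time, delete effects, and the planning graph itself, it sums precondition costs (one possible aggregation, chosen here as an assumption), and the execution costs assigned to the six travel actions are hypothetical stand-ins for the values in Figure 1.

```python
import math


def propagate_costs(initial_facts, actions):
    """Relax fact costs until no action can lower any of them.

    `actions` is a list of (name, preconditions, add_effects, exec_cost).
    Facts in the initial state cost 0; a fact's cost is the cheapest
    (execution cost + summed precondition costs) over its supporting actions.
    """
    cost = {fact: 0.0 for fact in initial_facts}
    changed = True
    while changed:
        changed = False
        for _name, pre, adds, exec_cost in actions:
            if any(p not in cost for p in pre):
                continue  # some precondition is not reachable yet
            total = exec_cost + sum(cost[p] for p in pre)
            for fact in adds:
                if total < cost.get(fact, math.inf):
                    cost[fact] = total
                    changed = True
    return cost


# Hypothetical execution costs, for illustration only.
ACTIONS = [
    ("drive-car1-tucson-phoenix", ["at(tucson)"], ["at(phoenix)"], 4.0),
    ("drive-car1-tucson-lv", ["at(tucson)"], ["at(lv)"], 8.0),
    ("drive-car2-tucson-phoenix", ["at(tucson)"], ["at(phoenix)"], 2.0),
    ("drive-car2-tucson-la", ["at(tucson)"], ["at(la)"], 12.0),
    ("fly-airplane-phoenix-la", ["at(phoenix)"], ["at(la)"], 6.0),
    ("use-train-lv-la", ["at(lv)"], ["at(la)"], 3.0),
]

costs = propagate_costs(["at(tucson)"], ACTIONS)
print(costs["at(la)"])
```

With these made-up numbers the cheapest way to reach LA goes through Phoenix (Car2, then the flight); a time-sensitive version would keep one such estimate per time point rather than a single number per fact.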


Related resources

Sapa: A Scalable Multi-objective Heuristic Metric Temporal Planner

In this research paper, we discuss Sapa, a domain-independent heuristic forward chaining planner that can handle durative actions, metric resource constraints, and deadline goals. It is designed to be capable of handling the multi-objective nature of the metric temporal planning. Our technical contributions include discussion of (i) various objective functions for planning and the multi-objecti...


Planning graph based heuristics for Partial Satisfaction Problems

In many real world planning scenarios, agents often do not have enough resources to achieve all of their goals. Hence, this requires finding plans that satisfy only a subset of them. Solving such partial satisfaction planning (PSP) problems poses several challenges, including an increased emphasis on modelling and handling plan quality (in terms of action costs and goal utilities). Despite the ...


SAPA: A Multi-objective Metric Temporal Planner

Sapa is a domain-independent heuristic forward chaining planner that can handle durative actions, metric resource constraints, and deadline goals. It is designed to be capable of handling the multi-objective nature of metric temporal planning. Our technical contributions include (i) planning-graph based methods for deriving heuristics that are sensitive to both cost and makespan (ii) techniques...


Probabilistic Temporal Planning

Planning research has explored the issues that arise when planning with concurrent and durative actions. Separately, planners that can cope with probabilistic effects have also been created. However, few attempts have been made to combine both probabilistic effects and concurrent durative actions into a single planner. The principal one of which we are aware was targeted at a specific domain. W...


Cost-Sensitive Reachability Heuristics for Probabilistic Planning

Reachability heuristics have led to impressive scale-ups in deterministic planning, making their application to probabilistic planning a promising research direction. We describe how one such reachability heuristic (based on planning graphs) can be extended to handle a class of cost-sensitive probabilistic planning problems. Specifically, we address the problem of conformant (non-observable) pr...



Publication date: 2002